The extended hunk-format (EHF)

Author: Sam Jordan (15-May-97)

With special consideration of assembler programming


1. Introduction

The extended hunk-format was developed to integrate PPC software into the
Amiga's operating system as smoothly as possible. To this end the existing
Amiga hunk-format was given several extensions that make it PPC-capable.
All extensions apply to object files only. The format of executables has
not changed at all.


2. New hunk-IDs

HUNK_PPC_CODE = $4e9

    This is equivalent to the existing HUNK_CODE; the only difference is
    that a hunk with the HUNK_PPC_CODE ID contains PowerPC code. A plain
    HUNK_CODE can also contain PPC code, but StormLINK uses the ID to
    determine whether a hunk contains PPC code, because PPC code often has
    to be treated differently from 68K code.

HUNK_RELRELOC26 = $4ec

    This is for the most part equivalent to the existing HUNK_RELRELOC16
    (same structure). The value to be corrected has a size of 4 bytes.
    However, only the least significant 26 bits may be corrected, because
    the upper 6 bits are part of the opcode and must not be overwritten.
    This hunk is used whenever the PowerPC command 'bx' branches into
    another code section (which is only possible in SmallCode mode anyway).
    The 'bx' command is a branch command with a displacement size of
    26 bits.

EXT_RELREF26 = 229

    This is equivalent to the existing EXT_RELREF16, which means that this
    ID can only occur within a HUNK_EXT. The displacement to be corrected
    again has a size of 26 bits (see HUNK_RELRELOC26). This ID is used
    whenever the PowerPC command 'bx' branches to an external address
    (only possible in SmallCode mode).

None of the IDs listed here will show up in executable programs. The only
exception to this rule is HUNK_PPC_CODE, which will be a valid ID in
p.OS/PPC.


3. Data models

Because all PowerPC commands have a length of 4 bytes, a full 32-bit
address does not fit into a single command, so it is not possible to
address variables or memory absolutely.
All memory accesses must be done relative to a base. For variable accesses
the base is, by definition, the r2 register. There are two data models:
SmallData and LargeData. Please note, however, that memory access is
relative even in the LargeData model. The two data models are described
below:

SmallData:

    A program created in SmallData mode contains only one Data-BSS-Hunk.
    When the executable program is started, the start address of this hunk
    is placed in r2. All variables can then be loaded/stored relative to
    this base. The SmallData hunk must be less than 64 Kbyte in size.

LargeData:

    A program created in LargeData mode may contain an arbitrary number of
    data- and BSS-hunks. In order to access variables an additional
    data-hunk is necessary (the so-called TOC-hunk). For each variable that
    exists in the program, a pointer to this variable is placed in the
    TOC-hunk. When the executable program is started, the start address of
    the TOC-hunk is placed in r2. In order to load a variable, the
    appropriate pointer must be read from the TOC-hunk first - after that
    the variable itself can be read. Access to variables is therefore
    slower in LargeData mode, because two memory accesses are needed.


4. Access to variables in assembler

In SmallData mode it is NOT possible to access data in code-sections. For
this reason it is highly recommended not to read any data from
code-sections, even when using the LargeData mode. If, for example, the
address of a PPC-function has to be determined, this should be done by
means of a help-pointer within a data-section.

In assembler, the address of a variable is determined with the
'la'-command (extended mnemonic). Depending on the data model used, the
la-command is assembled differently.
The command

    la      r3,label

is assembled in SmallData mode to

    addi    r3,r2,disp      ;disp is the offset of 'label' from the
                            ;hunk start

and in LargeData mode to

    lwz     r3,disp(r2)     ;disp is the offset of the pointer to
                            ;'label' from the start address of the
                            ;TOC-section

This makes the 'la' command very convenient, because the assembler takes
care of the differences, not the programmer.

Reading or writing a variable directly is more difficult. In SmallData
mode a single 'lwz'-command can do this, while in LargeData mode two
load/store commands are needed. It would be unacceptable to force the
programmer to commit to one of the data models when starting to develop a
program, so this difference should also be handled by the assembler.
StormPowerASM supports a number of pseudo-mnemonics that handle these
differences automatically. The commands can be implemented either as
macros or as additional directives. It is very important that other
assemblers implement the syntax and effect of these pseudo-commands in
exactly the same way!

Below is an overview of all pseudo-mnemonics for loading and storing a
variable:

    lw      rx,variable     ;Load a longword-variable
    lh      rx,variable     ;Load a word-variable (unsigned)
    lhs     rx,variable     ;Load a word-variable (signed)
    lb      rx,variable     ;Load a byte-variable (unsigned)
    lbs     rx,variable     ;Load a byte-variable (signed)
    lf      fx,variable     ;Load a floating-point-variable (double)
    ls      fx,variable     ;Load a floating-point-variable (single)
    sw      rx,variable     ;Store a longword-variable
    sh      rx,variable     ;Store a word-variable
    sb      rx,variable     ;Store a byte-variable
    sf      fx,variable     ;Store a floating-point-variable (double)
    ss      fx,variable     ;Store a floating-point-variable (single)

Important: The data register must not be r0. These pseudo-mnemonics fail
in LargeData mode if r0 is used.
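The two access paths, and one plausible reason for the r0 restriction, can be
sketched as a toy model in Python. This is only an illustration under stated
assumptions: the helper names and addresses are invented, and the LargeData
expansion shown (the destination register is reused to hold the pointer read
from the TOC) is a guess, not something this document specifies. What is part
of the PowerPC architecture is that a base register field of 0 in a load such
as 'lwz' means the constant 0, not the contents of r0.

```python
# Toy model of variable access in the two EHF data models.
# Register numbers follow the text (r2 is the base); all addresses
# and helper names are invented for this illustration.

def lwz(regs, mem, rd, disp, ra):
    """Model of 'lwz rd,disp(ra)' on a dict-based word memory."""
    # Architectural rule: a base field of 0 means the constant 0,
    # not the contents of r0.
    base = 0 if ra == 0 else regs[ra]
    regs[rd] = mem[base + disp]

def lw_small(regs, mem, rd, disp):
    """SmallData: one load relative to r2 (start of the Data-BSS hunk)."""
    lwz(regs, mem, rd, disp, 2)

def lw_large(regs, mem, rd, toc_disp):
    """LargeData, ASSUMED expansion: pointer from the TOC, then the value."""
    lwz(regs, mem, rd, toc_disp, 2)   # rd = pointer stored in the TOC hunk
    lwz(regs, mem, rd, 0, rd)         # rd = variable; goes wrong if rd is r0

def demo():
    """Load the same variable once through r5 and once through r0."""
    mem = {
        0x0000: 0xDEADBEEF,   # whatever happens to live at address 0
        0x1000: 0x2000,       # TOC entry: pointer to the variable
        0x2000: 0xABCDABCD,   # the variable itself
    }
    regs = [0] * 32
    regs[2] = 0x1000          # r2 = start of the TOC hunk
    lw_large(regs, mem, 5, 0)
    lw_large(regs, mem, 0, 0)
    return regs[5], regs[0]
```

In this model the load through r5 returns the variable, while the load through
r0 returns the word at address 0, because the second 'lwz' no longer uses the
pointer as its base.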
Some examples for pseudo-mnemonics:

            section code

            lw      r5,var1         ;r5 = $abcdabcd
            lh      r5,var2         ;r5 = $00001234
            lhs     r5,var3         ;r5 = $fffffedc
            lf      f3,fvar         ;f3 = 3.141
            ls      f0,fvar2        ;f0 = 1.6666
            sw      r5,var4         ;$fffffedc is stored in 'var4'
            sb      r5,var5         ;$dc is stored in 'var5'

            section data

    var1    dc.l    $abcdabcd
    var2    dc.w    $1234
    var3    dc.w    $fedc
    fvar    dc.d    3.141
    fvar2   dc.s    1.6666
    var4    dc.l    0
    var5    dc.b    0
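The register values given in the example comments follow from zero-extension
(unsigned loads), sign-extension (signed loads) and truncation (narrow stores).
A minimal Python sketch of that arithmetic is shown here. The function names
are hypothetical and only model the effect on a 32-bit register; they say
nothing about how StormPowerASM implements the pseudo-mnemonics.

```python
# Model of how narrow loads fill a 32-bit register and how narrow
# stores truncate it. Function names are invented for illustration.

def load_unsigned(value, bits):
    """Zero-extend a 'bits'-wide memory value (effect of lh / lb)."""
    return value & ((1 << bits) - 1)

def load_signed(value, bits):
    """Sign-extend a 'bits'-wide memory value to 32 bits (lhs / lbs)."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        # Top bit set: fill the upper register bits with ones.
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)
    return value

def store_narrow(register, bits):
    """Keep only the low 'bits' bits of a register (effect of sh / sb)."""
    return register & ((1 << bits) - 1)

# The values from the example: var2 = $1234, var3 = $fedc.
r5_after_lh = load_unsigned(0x1234, 16)
r5_after_lhs = load_signed(0xFEDC, 16)
var5_after_sb = store_narrow(r5_after_lhs, 8)
```

With these definitions the three results correspond to the comments on the
'lh', 'lhs' and 'sb' lines of the example.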